Reversal Learning in Humans and Gerbils: Dynamic Control Network Facilitates Learning
Abstract
Biologically plausible modeling of behavioral reinforcement learning tasks has improved greatly over the past decades. Less work has been dedicated to tasks involving contingency reversals, i.e., tasks in which the original behavioral goal is reversed one or more times. The ability to adjust to such reversals is a key element of behavioral flexibility. Here, we investigate the neural mechanisms underlying contingency-reversal tasks. We first conduct experiments with humans and gerbils to demonstrate memory effects, including multiple reversals in which subjects (humans and animals) learn faster when a previously learned contingency reappears. Motivated by recurrent mechanisms of learning and memory for object categories, we propose a network architecture in which reinforcement learning steers an orienting system that monitors success in reward acquisition. We suggest that a model sensory system provides feature representations that are further processed by category-related subnetworks, which constitute a neural analog of expert networks. Categories are selected dynamically in a competitive field and predict the expected reward. Learning occurs in sequential phases that selectively focus weight adaptation on synapses in the hierarchical network and modulate their weight changes by a global modulator signal. The orienting subsystem itself learns to bias the competition in the presence of continuous monotonic reward accumulation. If the discrepancy between predicted and acquired reward changes suddenly, the activated motor category can be switched. We suggest that this subsystem is composed of a hierarchically organized network of disinhibitory mechanisms, dubbed a dynamic control network (DCN), which resembles components of the basal ganglia. The DCN selectively activates an expert network corresponding to the current behavioral strategy. The trace of the accumulated reward is monitored such that large, sudden deviations from the monotonicity of its evolution trigger a reset, after which another expert subnetwork can be activated (if it has already been established) or new categories can be recruited and associated with novel behavioral patterns.
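The reset-and-switch idea can be illustrated with a toy sketch. This is not the network model described in the abstract: it swaps the neural expert subnetworks for small lookup tables and the DCN for a leaky "surprise" trace, and every name and parameter in it (`ExpertSwitcher`, `run_block`, the learning rate, decay, and threshold values) is a hypothetical choice made for illustration only. It assumes a deterministic task with two stimuli and two actions.

```python
class ExpertSwitcher:
    """Toy stand-in for expert selection with a reward-discrepancy monitor.

    Assumptions (not from the source): tabular experts, a leaky trace of
    reward shortfall, and a fixed threshold that triggers the reset.
    """

    def __init__(self, n_stimuli=2, n_actions=2, lr=0.5,
                 decay=0.5, threshold=0.6, init_value=0.1):
        self.n_stimuli, self.n_actions = n_stimuli, n_actions
        self.lr, self.decay, self.threshold = lr, decay, threshold
        self.init_value = init_value      # mild optimism drives exploration
        self.experts = [self._new_expert()]
        self.active = 0                   # index of the currently gated expert
        self.surprise = 0.0               # leaky trace of reward shortfall
        self.tried = set()                # experts rejected since last reward

    def _new_expert(self):
        return [[self.init_value] * self.n_actions
                for _ in range(self.n_stimuli)]

    def act(self, stimulus):
        # Greedy choice from the active expert; ties go to the lowest index.
        q = self.experts[self.active][stimulus]
        return max(range(self.n_actions), key=q.__getitem__)

    def learn(self, stimulus, action, reward):
        q = self.experts[self.active][stimulus]
        shortfall = max(0.0, q[action] - reward)   # predicted minus acquired
        self.surprise = self.decay * self.surprise + shortfall
        if self.surprise > self.threshold:
            self._reset()
            return        # skip the update so the stored expert stays intact
        q[action] += self.lr * (reward - q[action])
        if reward > 0:
            self.tried.clear()

    def _reset(self):
        # Re-activate a stored expert if one is untried, else recruit a new one.
        self.surprise = 0.0
        self.tried.add(self.active)
        untried = [i for i in range(len(self.experts)) if i not in self.tried]
        if untried:
            self.active = untried[0]
        else:
            self.experts.append(self._new_expert())
            self.active = len(self.experts) - 1
            self.tried.clear()


def run_block(agent, rule, n_trials=20):
    """Run one block under a fixed stimulus-to-action rule; return error count."""
    errors = 0
    for t in range(n_trials):
        stimulus = t % agent.n_stimuli
        action = agent.act(stimulus)
        reward = 1.0 if action == rule[stimulus] else 0.0
        if reward == 0.0:
            errors += 1
        agent.learn(stimulus, action, reward)
    return errors


if __name__ == "__main__":
    agent = ExpertSwitcher()
    rule_a, rule_b = [0, 1], [1, 0]       # rule_b reverses rule_a
    for name, rule in [("A", rule_a), ("B", rule_b), ("A again", rule_a)]:
        print(name, "errors:", run_block(agent, rule),
              "active expert:", agent.active)
```

In this sketch the first reversal recruits a second table, while the return to the original rule re-activates the stored one, so the re-learned block should cost fewer errors than the first acquisition of the reversed rule. Skipping the weight update on the reset trial is one simple way to keep an established expert from being overwritten by the reversal.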